Terminate pool when timeout is reached for parallel tests.#53860
Merged
potiuk merged 1 commit intoapache:mainfrom Aug 1, 2025
Merged
Terminate pool when timeout is reached for parallel tests.#53860potiuk merged 1 commit intoapache:mainfrom
potiuk merged 1 commit intoapache:mainfrom
Conversation
Member
Author
|
The origin of this PR - when trying to diagnode Sqlallchemy 2 CI #52233 it turned out that when things timed-out for all tests, the log output has not been printed . |
85291ef to
ff08b2b
Compare
gopidesupavan
approved these changes
Jul 29, 2025
Member
|
Nice Thanks for the update :) LGTM |
aritra24
approved these changes
Jul 29, 2025
0b167d6 to
ad1d84a
Compare
bugraoz93
approved these changes
Jul 30, 2025
Contributor
|
Great! |
When we reach timeut we kill all the hanging containers already and after the pool has been terminated, we will print all the logs. However, when the pool had not yet been fully executing (i.e the containers were hanging and some tasks were not started) - without terminating the pool that would kill running containers and the remaining tasks would start new ones. This PR changes the timeout handler to terminate the pool before attempting to kill all the containers. It also turned out that exit handling by the main thread monitorint the tests in this case would hang rather than print logs: * it was waiting in a loop to wait for all task to complete (which would never happen) * it was trying to retrieve result from ApplyResult without timeout where it would hang for ever for terminated tasks This PR introduces a separate path to handle timeout, which does not wait for those two and handles timeout immediately. It also refactors the whole "end of tests" method splitting it into several methods to make it easier to reason and read.
ad1d84a to
0364bd0
Compare
Backport failed to create: v3-0-test. View the failure log Run details
You can attempt to backport this manually by running: cherry_picker e8d424e v3-0-testThis should apply the commit to the v3-0-test branch and leave the commit in conflict state marking After you have resolved the conflicts, you can continue the backport process by running: cherry_picker --continue |
ferruzzi
pushed a commit
to aws-mwaa/upstream-to-airflow
that referenced
this pull request
Aug 7, 2025
) When we reach timeut we kill all the hanging containers already and after the pool has been terminated, we will print all the logs. However, when the pool had not yet been fully executing (i.e the containers were hanging and some tasks were not started) - without terminating the pool that would kill running containers and the remaining tasks would start new ones. This PR changes the timeout handler to terminate the pool before attempting to kill all the containers. It also turned out that exit handling by the main thread monitorint the tests in this case would hang rather than print logs: * it was waiting in a loop to wait for all task to complete (which would never happen) * it was trying to retrieve result from ApplyResult without timeout where it would hang for ever for terminated tasks This PR introduces a separate path to handle timeout, which does not wait for those two and handles timeout immediately. It also refactors the whole "end of tests" method splitting it into several methods to make it easier to reason and read.
fweilun
pushed a commit
to fweilun/airflow
that referenced
this pull request
Aug 11, 2025
) When we reach timeut we kill all the hanging containers already and after the pool has been terminated, we will print all the logs. However, when the pool had not yet been fully executing (i.e the containers were hanging and some tasks were not started) - without terminating the pool that would kill running containers and the remaining tasks would start new ones. This PR changes the timeout handler to terminate the pool before attempting to kill all the containers. It also turned out that exit handling by the main thread monitorint the tests in this case would hang rather than print logs: * it was waiting in a loop to wait for all task to complete (which would never happen) * it was trying to retrieve result from ApplyResult without timeout where it would hang for ever for terminated tasks This PR introduces a separate path to handle timeout, which does not wait for those two and handles timeout immediately. It also refactors the whole "end of tests" method splitting it into several methods to make it easier to reason and read.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
When we reach timeut we kill all the hanging containers already and after the pool has been terminated, we will print all the logs.
However, when the pool had not yet been fully executing (i.e the containers were hanging and some tasks were not started) - without terminating the pool that would kill running containers and the remaining tasks would start new ones.
This PR changes the timeout handler to terminate the pool before attempting to kill all the containers.
This PR changes the timeout handler to terminate the pool before
attempting to kill all the containers.
It also turned out that exit handling by the main thread monitorint
the tests in this case would hang rather than print logs:
it was waiting in a loop to wait for all task to complete (which
would never happen)
it was trying to retrieve result from ApplyResult without timeout
where it would hang for ever for terminated tasks
This PR introduces a separate path to handle timeout, which does
not wait for those two and handles timeout immediately. It also
refactors the whole "end of tests" method splitting it into several
methods to make it easier to reason and read.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in airflow-core/newsfragments.